 early detection


A Clinically Interpretable Deep CNN Framework for Early Chronic Kidney Disease Prediction Using Grad-CAM-Based Explainable AI

Ayub, Anas Bin, Niha, Nilima Sultana, Haque, Md. Zahurul

arXiv.org Artificial Intelligence

Chronic Kidney Disease (CKD) constitutes a major global medical burden, marked by the gradual deterioration of renal function, which results in the impaired clearance of metabolic waste and disturbances in systemic fluid homeostasis. Owing to its substantial contribution to worldwide morbidity and mortality, the development of reliable and efficient diagnostic approaches is critically important to facilitate early detection and prompt clinical management. This study presents a deep convolutional neural network (CNN) for early CKD detection from CT kidney images, complemented by class balancing using Synthetic Minority Over-sampling Technique (SMOTE) and interpretability via Gradient-weighted Class Activation Mapping (Grad-CAM). The model was trained and evaluated on the CT KIDNEY DATASET, which contains 12,446 CT images, including 3,709 cyst, 5,077 normal, 1,377 stone, and 2,283 tumor cases. The proposed deep CNN achieved a remarkable classification performance, attaining 100% accuracy in the early detection of chronic kidney disease (CKD). This significant advancement demonstrates strong potential for addressing critical clinical diagnostic challenges and enhancing early medical intervention strategies.
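
The class counts quoted above are imbalanced, which is the situation SMOTE is designed for. The following is a minimal sketch of SMOTE's core interpolation step, not the authors' pipeline: the function name `smote_sample`, the toy two-dimensional feature vectors, and the choice to raise every class to the majority size are all assumptions made for illustration.

```python
import random

def smote_sample(x, neighbors, rng):
    """Create one synthetic point on the segment between x and a random neighbor."""
    neighbor = rng.choice(neighbors)
    gap = rng.random()  # uniform in [0, 1)
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]

# Class counts quoted in the abstract.
counts = {"cyst": 3709, "normal": 5077, "stone": 1377, "tumor": 2283}
total = sum(counts.values())

# Assumption: each minority class is oversampled up to the majority class size.
target = max(counts.values())
needed = {label: target - n for label, n in counts.items()}

# Toy example: one minority sample and two of its (hypothetical) nearest neighbors.
rng = random.Random(0)
synthetic = smote_sample([0.0, 1.0], [[1.0, 3.0], [2.0, 1.0]], rng)
```

Under that balancing assumption, `needed` gives the number of synthetic samples per class; in practice a library implementation such as the one in imbalanced-learn would normally be used rather than hand-written interpolation.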


Tackling a Challenging Corpus for Early Detection of Gambling Disorder: UNSL at MentalRiskES 2025

Thompson, Horacio, Errecalde, Marcelo

arXiv.org Artificial Intelligence

Gambling disorder is a complex behavioral addiction that is challenging to understand and address, with severe physical, psychological, and social consequences. Early Risk Detection (ERD) on the Web has become a key task in the scientific community for identifying early signs of mental health behaviors based on social media activity. This work presents our participation in the MentalRiskES 2025 challenge, specifically in Task 1, aimed at classifying users at high or low risk of developing a gambling-related disorder. We proposed three methods based on a CPI+DMC approach, addressing predictive effectiveness and decision-making speed as independent objectives. The components were implemented using the SS3, BERT with extended vocabulary, and SBERT models, followed by decision policies based on historical user analysis. Although it was a challenging corpus, two of our proposals achieved the top two positions in the official results, performing notably in decision metrics. Further analysis revealed some difficulty in distinguishing between users at high and low risk, reinforcing the need to explore strategies to improve data interpretation and quality, and to promote more transparent and reliable ERD systems for mental disorders.
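
The abstract treats predictive effectiveness and decision speed as separate objectives, with a decision policy that looks at a user's history. As a rough illustration of what such a policy can look like, the sketch below raises an alert once a risk score stays above a threshold for several consecutive posts. This is a hypothetical rule, not the authors' policy: `first_alert_step`, `threshold`, and `patience` are names and values assumed here.

```python
def first_alert_step(risk_scores, threshold=0.7, patience=3):
    """Return the 1-based post index at which an alert is raised, or None.

    An alert fires once `patience` consecutive scores are >= `threshold`.
    """
    streak = 0
    for step, score in enumerate(risk_scores, start=1):
        # Extend the streak on a high score, reset it otherwise.
        streak = streak + 1 if score >= threshold else 0
        if streak >= patience:
            return step
    return None

# Hypothetical per-post risk scores for one user, in posting order.
alert_at = first_alert_step([0.2, 0.8, 0.9, 0.75, 0.1])
```

A larger `patience` trades decision speed for fewer false alarms, which is the tension that early-risk decision metrics are meant to capture.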


Explainable AI For Early Detection Of Sepsis

Thakur, Atharva, Dhumal, Shruti

arXiv.org Artificial Intelligence

Department of Multidisciplinary Engineering (AI & DS), Vishwakarma Institute of Technology, Pune, 411037, Maharashtra, India. Abstract: Sepsis is a potentially fatal medical disorder that needs to be identified and treated right away to avoid fatalities. It must be quickly identified and treated in order to stop it from developing into severe sepsis, septic shock, and multi-organ failure. Sepsis remains a significant problem for doctors despite advancements in medical technology and treatment methods. The beginning of the disease has been successfully predicted by machine learning models in recent years, but due to their black-box character, it is challenging to interpret these predictions and comprehend the underlying illness mechanisms. In this research, we propose a comprehensible AI method for sepsis analysis that combines machine learning with clinical knowledge and expertise in the domain. Our method allows clinicians to understand and verify the model's predictions based on clinical expertise and preexisting beliefs, in addition to providing precise predictions of the onset of sepsis. Keywords: Sepsis, Artificial Intelligence, Machine Learning, Explainable AI, Sensitivity Analysis. I. Introduction: As the world continues to advance in technology, the potential of artificial intelligence (AI) in healthcare is becoming more apparent.


Leveraging LLMs for Early Alzheimer's Prediction

Songdechakraiwut, Tananun

arXiv.org Artificial Intelligence

We present a connectome-informed LLM framework that encodes dynamic fMRI connectivity as temporal sequences, applies robust normalization, and maps these data into a representation suitable for a frozen pre-trained LLM for clinical prediction. Applied to early Alzheimer's detection, our method achieves sensitive prediction with error rates well below clinically recognized margins, with implications for timely Alzheimer's intervention.


PoTS: Proof-of-Training-Steps for Backdoor Detection in Large Language Models

Seddik, Issam, Souihi, Sami, Tamaazousti, Mohamed, Piergiovanni, Sara Tucci

arXiv.org Artificial Intelligence

As Large Language Models (LLMs) gain traction across critical domains, ensuring secure and trustworthy training processes has become a major concern. Backdoor attacks, where malicious actors inject hidden triggers into training data, are particularly insidious and difficult to detect. Existing post-training verification solutions like Proof-of-Learning are impractical for LLMs due to their requirement for full retraining, lack of robustness against stealthy manipulations, and inability to provide early detection during training. Early detection would significantly reduce computational costs. To address these limitations, we introduce Proof-of-Training Steps, a verification protocol that enables an independent auditor (Alice) to confirm that an LLM developer (Bob) has followed the declared training recipe, including data batches, architecture, and hyperparameters. By analyzing the sensitivity of the LLMs' language modeling head (LM-Head) to input perturbations, our method can expose subtle backdoor injections or deviations in training. Even with backdoor triggers in up to 10 percent of the training data, our protocol significantly reduces the attacker's ability to achieve a high attack success rate (ASR). Our method enables early detection of attacks at the injection step, with verification steps being 3x faster than training steps. Our results highlight the protocol's potential to enhance the accountability and security of LLM development, especially against insider threats.


Ensemble Deep Learning and LLM-Assisted Reporting for Automated Skin Lesion Diagnosis

Khan, Sher, Muhammad, Raz, Hussain, Adil, Sajjad, Muhammad, Rashid, Muhammad

arXiv.org Artificial Intelligence

Cutaneous malignancies demand early detection for favorable outcomes, yet current diagnostics suffer from inter-observer variability and access disparities. While AI shows promise, existing dermatological systems are limited by homogeneous architectures, dataset biases across skin tones, and fragmented approaches that treat natural language processing as separate post-hoc explanations rather than integral to clinical decision-making. We introduce a unified framework that fundamentally reimagines AI integration for dermatological diagnostics through two synergistic innovations. First, a purposefully heterogeneous ensemble of architecturally diverse convolutional neural networks provides complementary diagnostic perspectives, with an intrinsic uncertainty mechanism flagging discordant cases for specialist review -- mimicking clinical best practices. Second, we embed large language model capabilities directly into the diagnostic workflow, transforming classification outputs into clinically meaningful assessments that simultaneously fulfill medical documentation requirements and deliver patient-centered education. This seamless integration generates structured reports featuring precise lesion characterization, accessible diagnostic reasoning, and actionable monitoring guidance -- empowering patients to recognize early warning signs between visits. By addressing both diagnostic reliability and communication barriers within a single cohesive system, our approach bridges the critical translational gap that has prevented previous AI implementations from achieving clinical impact. The framework represents a significant advancement toward deployable dermatological AI that enhances diagnostic precision while actively supporting the continuum of care from initial detection through patient education, ultimately improving early intervention rates for skin lesions.
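
The uncertainty mechanism described above, where discordant ensemble outputs are flagged for specialist review, can be pictured with a simple majority vote. The sketch below is an assumption-laden illustration rather than the paper's mechanism: `ensemble_decision`, the `min_agreement` cutoff, and the class labels are all hypothetical.

```python
from collections import Counter

def ensemble_decision(predictions, min_agreement=0.75):
    """Majority vote over per-model labels; flag for review when agreement is low."""
    votes = Counter(predictions)
    label, count = votes.most_common(1)[0]
    agreement = count / len(predictions)
    return {
        "label": label,
        "agreement": agreement,
        "needs_review": agreement < min_agreement,
    }

# Four hypothetical models voting on one lesion image.
concordant = ensemble_decision(["benign", "benign", "benign", "benign"])
discordant = ensemble_decision(["benign", "malignant", "benign", "malignant"])
```

Here the second case splits evenly, so it would be routed to a specialist; a real system would more likely compare predicted probabilities than hard labels.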


SINAI at eRisk@CLEF 2023: Approaching Early Detection of Gambling with Natural Language Processing

Marmol-Romero, Alba Maria, Plaza-del-Arco, Flor Miriam, Montejo-Raez, Arturo

arXiv.org Artificial Intelligence

This paper describes the participation of the SINAI team in the eRisk@CLEF lab. Specifically, one of the proposed tasks has been addressed: Task 2 on the early detection of signs of pathological gambling. The approach presented for Task 2 is based on pre-trained Transformer models, combined with comprehensive data preprocessing and data balancing techniques. Moreover, we integrate a Long Short-Term Memory (LSTM) architecture with automodels from Transformers. In this task, our team ranked seventh out of 49 participant submissions, with an F1 score of 0.126, and achieved the highest values in recall metrics and metrics related to early detection.


SINAI at eRisk@CLEF 2022: Approaching Early Detection of Gambling and Eating Disorders with Natural Language Processing

Marmol-Romero, Alba Maria, Jimenez-Zafra, Salud Maria, Plaza-del-Arco, Flor Miriam, Molina-Gonzalez, M. Dolores, Martin-Valdivia, Maria-Teresa, Montejo-Raez, Arturo

arXiv.org Artificial Intelligence

This paper describes the participation of the SINAI team in the eRisk@CLEF lab. Specifically, two of the proposed tasks have been addressed: i) Task 1 on the early detection of signs of pathological gambling, and ii) Task 3 on measuring the severity of the signs of eating disorders. The approach presented in Task 1 is based on the use of sentence embeddings from Transformers with features related to volumetry, lexical diversity, complexity metrics, and emotion-related scores, while the approach for Task 3 is based on text similarity estimation using contextualized word embeddings from Transformers. In Task 1, our team has been ranked in second position, with an F1 score of 0.808, out of 41 participant submissions. In Task 3, our team also placed second out of a total of 3 participating teams.


Branched Broomrape Detection in Tomato Farms Using Satellite Imagery and Time-Series Analysis

Narimani, Mohammadreza, Pourreza, Alireza, Moghimi, Ali, Farajpoor, Parastoo, Jafarbiglu, Hamid, Mesgaran, Mohsen

arXiv.org Artificial Intelligence

Branched broomrape (Phelipanche ramosa (L.) Pomel) is a chlorophyll-deficient parasitic plant that threatens tomato production by extracting nutrients from the host, with reported yield losses up to 80 percent. Its mostly subterranean life cycle and prolific seed production (more than 200,000 seeds per plant, viable for up to 20 years) make early detection essential. We present an end-to-end pipeline that uses Sentinel-2 imagery and time-series analysis to identify broomrape-infested tomato fields in California. Regions of interest were defined from farmer-reported infestations, and images with less than 10 percent cloud cover were retained. We processed 12 spectral bands and sun-sensor geometry, computed 20 vegetation indices (e.g., NDVI, NDMI), and derived five plant traits (Leaf Area Index, Leaf Chlorophyll Content, Canopy Chlorophyll Content, Fraction of Absorbed Photosynthetically Active Radiation, and Fractional Vegetation Cover) using a neural network calibrated with ground-truth and synthetic data. Trends in Canopy Chlorophyll Content delineated transplanting-to-harvest periods, and phenology was aligned using growing degree days. Vegetation pixels were segmented and used to train a Long Short-Term Memory (LSTM) network on 18,874 pixels across 48 growing-degree-day time points. The model achieved 88 percent training accuracy and 87 percent test accuracy, with precision 0.86, recall 0.92, and F1 0.89. Permutation feature importance ranked NDMI, Canopy Chlorophyll Content, FAPAR, and a chlorophyll red-edge index as most informative, consistent with the physiological effects of infestation. Results show the promise of satellite-driven time-series modeling for scalable detection of parasitic stress in tomato farms.
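
Two of the quantities named above are easy to reproduce by hand. The sketch below computes NDVI and NDMI as normalized differences, using the standard definitions (near-infrared against red, and near-infrared against shortwave infrared), and checks that the reported precision and recall are consistent with the reported F1. The reflectance values are made up for illustration, and the abstract does not state which Sentinel-2 bands the authors used for each index.

```python
def normalized_difference(a, b):
    """Generic normalized-difference index: (a - b) / (a + b)."""
    return (a - b) / (a + b)

def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)

def ndmi(nir, swir1):
    """Normalized Difference Moisture Index."""
    return normalized_difference(nir, swir1)

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical surface reflectances for one vegetated pixel.
pixel = {"red": 0.05, "nir": 0.45, "swir1": 0.20}
pixel_ndvi = ndvi(pixel["nir"], pixel["red"])
pixel_ndmi = ndmi(pixel["nir"], pixel["swir1"])

# Precision and recall as quoted in the abstract.
reported_f1 = round(f1_score(0.86, 0.92), 2)
```

With precision 0.86 and recall 0.92, the harmonic mean rounds to 0.89, which matches the F1 quoted in the abstract. The sketch does not guard against a zero denominator, which would need handling for masked or no-data pixels.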


Early Detection of Pancreatic Cancer Using Multimodal Learning on Electronic Health Records

Aouad, Mosbah, Choudhary, Anirudh, Farooq, Awais, Nevers, Steven, Demirkhanyan, Lusine, Harris, Bhrandon, Pappu, Suguna, Gondi, Christopher, Iyer, Ravishankar

arXiv.org Artificial Intelligence

Pancreatic ductal adenocarcinoma (PDAC) is one of the deadliest cancers, and early detection remains a major clinical challenge due to the absence of specific symptoms and reliable biomarkers. In this work, we propose a new multimodal approach that integrates longitudinal diagnosis code histories and routinely collected laboratory measurements from electronic health records to detect PDAC up to one year prior to clinical diagnosis. Our method combines neural controlled differential equations to model irregular lab time series, pretrained language models and recurrent networks to learn diagnosis code trajectory representations, and cross-attention mechanisms to capture interactions between the two modalities. We develop and evaluate our approach on a real-world dataset of nearly 4,700 patients and achieve significant improvements in AUC ranging from 6.5% to 15.5% over state-of-the-art methods. Furthermore, our model identifies diagnosis codes and laboratory panels associated with elevated PDAC risk, including both established and new biomarkers.